{
 "cells": [
  {
   "cell_type": "markdown",
   "source": [
    "<a href=\"https://colab.research.google.com/github/milvus-io/bootcamp/blob/master/bootcamp/RAG/advanced_rag/query_routing_with_langchain.ipynb\" target=\"_parent\"><img src=\"https://colab.research.google.com/assets/colab-badge.svg\" alt=\"Open In Colab\"/></a>\n",
    "\n",
    "# Query routing"
   ],
   "metadata": {
    "collapsed": false
   }
  },
  {
   "cell_type": "markdown",
   "source": [
    "## Google Colab preparation[optional]\n",
    "This is an optional step, if you want to run this notebook on Google Colab."
   ],
   "metadata": {
    "collapsed": false,
    "pycharm": {
     "name": "#%% md\n"
    }
   }
  },
  {
   "cell_type": "code",
   "execution_count": null,
   "outputs": [],
   "source": [
    "! git clone https://github.com/milvus-io/bootcamp.git"
   ],
   "metadata": {
    "collapsed": false,
    "pycharm": {
     "name": "#%%\n"
    }
   }
  },
  {
   "cell_type": "code",
   "execution_count": null,
   "outputs": [],
   "source": [
    "import shutil\n",
    "src_dir = \"./bootcamp/bootcamp/RAG/advanced_rag/rag_utils\"\n",
    "dst_dir = \"./rag_utils\"\n",
    "shutil.copytree(src_dir, dst_dir)"
   ],
   "metadata": {
    "collapsed": false,
    "pycharm": {
     "name": "#%%\n"
    }
   }
  },
  {
   "cell_type": "code",
   "execution_count": null,
   "outputs": [],
   "source": [
    "! pip install --upgrade langchain langchain-community langchain_milvus langchain-openai bs4"
   ],
   "metadata": {
    "collapsed": false,
    "pycharm": {
     "name": "#%%\n"
    }
   }
  },
  {
   "cell_type": "markdown",
   "source": [
    "Please prepare you [OPENAI_API_KEY](https://openai.com/index/openai-api/) in your environment variables.\n",
    "![](imgs/colab_api_key1.png)"
   ],
   "metadata": {
    "collapsed": false
   }
  },
  {
   "cell_type": "markdown",
   "source": [
    "### If you are running this notebook on Google Colab, you have to restart this session by `Cmd/Ctrl + M`, then press `.` to make the environment take effect."
   ],
   "metadata": {
    "collapsed": false
   }
  },
  {
   "cell_type": "code",
   "execution_count": null,
   "outputs": [],
   "source": [
    "from google.colab import userdata\n",
    "import os\n",
    "\n",
    "os.environ['OPENAI_API_KEY'] = userdata.get('OPENAI_API_KEY')"
   ],
   "metadata": {
    "collapsed": false,
    "pycharm": {
     "name": "#%%\n"
    }
   }
  },
  {
   "cell_type": "markdown",
   "source": [
    "----\n",
    "## Get started\n",
    "![](imgs/query_routing.png)"
   ],
   "metadata": {
    "collapsed": false
   }
  },
  {
   "cell_type": "markdown",
   "source": [
    "## Prepare the data\n",
    "\n",
    "We use the Langchain WebBaseLoader to load documents from [blog sources](https://lilianweng.github.io/posts/2023-06-23-agent/) and split them into chunks using the RecursiveCharacterTextSplitter."
   ],
   "metadata": {
    "collapsed": false
   }
  },
  {
   "cell_type": "code",
   "execution_count": 1,
   "metadata": {
    "collapsed": false,
    "jupyter": {
     "outputs_hidden": false
    },
    "pycharm": {
     "name": "#%%\n"
    }
   },
   "outputs": [],
   "source": [
    "import bs4\n",
    "from langchain_community.document_loaders import WebBaseLoader\n",
    "from langchain_text_splitters import RecursiveCharacterTextSplitter\n",
    "\n",
    "from rag_utils.vanilla import vectorstore\n",
    "\n",
    "# Create a WebBaseLoader instance to load documents from web sources\n",
    "loader = WebBaseLoader(\n",
    "    web_paths=(\"https://lilianweng.github.io/posts/2023-06-23-agent/\",),\n",
    "    bs_kwargs=dict(\n",
    "        parse_only=bs4.SoupStrainer(\n",
    "            class_=(\"post-content\", \"post-title\", \"post-header\")\n",
    "        )\n",
    "    ),\n",
    ")\n",
    "# Load documents from web sources using the loader\n",
    "documents = loader.load()\n",
    "# Initialize a RecursiveCharacterTextSplitter for splitting text into chunks\n",
    "text_splitter = RecursiveCharacterTextSplitter(chunk_size=1000, chunk_overlap=0)\n",
    "\n",
    "# Split the documents into chunks using the text_splitter\n",
    "docs = text_splitter.split_documents(documents)"
   ]
  },
  {
   "cell_type": "markdown",
   "source": [
    "## Build the chain\n",
    "\n",
    "We load the docs into milvus vectorstore, and build a milvus retriever."
   ],
   "metadata": {
    "collapsed": false
   }
  },
  {
   "cell_type": "code",
   "execution_count": 2,
   "metadata": {},
   "outputs": [],
   "source": [
    "vectorstore.add_documents(docs)\n",
    "retriever = vectorstore.as_retriever()"
   ]
  },
  {
   "cell_type": "markdown",
   "source": [
    "Build a router chain, and try to invoke it. It can return a string that classifies whether the query is decomposable."
   ],
   "metadata": {
    "collapsed": false
   }
  },
  {
   "cell_type": "code",
   "execution_count": 3,
   "metadata": {
    "collapsed": false,
    "jupyter": {
     "outputs_hidden": false
    },
    "pycharm": {
     "name": "#%%\n"
    }
   },
   "outputs": [
    {
     "data": {
      "text/plain": [
       "'Decomposable\\nReason: The question can be decomposed into two sub-questions: \"How can I use Milvus?\" and \"What is Zilliz?\".'"
      ]
     },
     "execution_count": 3,
     "metadata": {},
     "output_type": "execute_result"
    }
   ],
   "source": [
    "from langchain_core.output_parsers import StrOutputParser\n",
    "from langchain_core.runnables import RunnablePassthrough\n",
    "from langchain_core.prompts import PromptTemplate\n",
    "from rag_utils.vanilla import llm\n",
    "from rag_utils.route import ROUTER_PROMPT\n",
    "\n",
    "router_chain = (\n",
    "    {\"question\": RunnablePassthrough()}\n",
    "    | PromptTemplate.from_template(ROUTER_PROMPT)\n",
    "    | llm\n",
    "    | StrOutputParser()\n",
    ")\n",
    "\n",
    "router_chain.invoke(\"How can I use Milvus and what is the zilliz\")"
   ]
  },
  {
   "cell_type": "markdown",
   "source": [
    "Define the vanilla RAG chain."
   ],
   "metadata": {
    "collapsed": false
   }
  },
  {
   "cell_type": "code",
   "execution_count": 4,
   "metadata": {},
   "outputs": [],
   "source": [
    "from langchain_core.output_parsers import StrOutputParser\n",
    "from langchain_core.runnables import RunnablePassthrough\n",
    "from rag_utils.vanilla import format_docs, rag_prompt, llm\n",
    "\n",
    "vanilla_rag_chain = (\n",
    "    {\"context\": retriever | format_docs, \"question\": RunnablePassthrough()}\n",
    "    | rag_prompt\n",
    "    | llm\n",
    "    | StrOutputParser()\n",
    ")"
   ]
  },
  {
   "cell_type": "markdown",
   "source": [
    "Define the sub query chain."
   ],
   "metadata": {
    "collapsed": false
   }
  },
  {
   "cell_type": "code",
   "execution_count": 5,
   "metadata": {
    "collapsed": false,
    "jupyter": {
     "outputs_hidden": false
    },
    "pycharm": {
     "name": "#%%\n"
    }
   },
   "outputs": [],
   "source": [
    "from rag_utils.sub_query import SubQueryRetriever\n",
    "from langchain_core.runnables import RunnablePassthrough\n",
    "\n",
    "sub_query_retriever = SubQueryRetriever.from_vectorstore(vectorstore)\n",
    "\n",
    "sub_query_chain = (\n",
    "    {\"context\": sub_query_retriever | format_docs, \"question\": RunnablePassthrough()}\n",
    "    | rag_prompt\n",
    "    | llm\n",
    "    | StrOutputParser()\n",
    ")"
   ]
  },
  {
   "cell_type": "markdown",
   "source": [
    "Define a route function."
   ],
   "metadata": {
    "collapsed": false
   }
  },
  {
   "cell_type": "code",
   "execution_count": 6,
   "metadata": {
    "collapsed": false,
    "jupyter": {
     "outputs_hidden": false
    },
    "pycharm": {
     "name": "#%%\n"
    }
   },
   "outputs": [],
   "source": [
    "from rag_utils.route import parse_router_output\n",
    "\n",
    "\n",
    "def route(info):\n",
    "    if parse_router_output(info[\"category\"]) == \"Decomposable\":\n",
    "        print(\"invoke sub_query_chain...\")\n",
    "        return RunnableLambda(lambda x: x[\"question\"]) | sub_query_chain\n",
    "    else:  # Independent\n",
    "        print(\"invoke vanilla_rag_chain...\")\n",
    "        return RunnableLambda(lambda x: x[\"question\"]) | vanilla_rag_chain"
   ]
  },
  {
   "cell_type": "markdown",
   "source": [
    "Let's define the full chain."
   ],
   "metadata": {
    "collapsed": false
   }
  },
  {
   "cell_type": "code",
   "execution_count": 7,
   "metadata": {
    "collapsed": false,
    "jupyter": {
     "outputs_hidden": false
    },
    "pycharm": {
     "name": "#%%\n"
    }
   },
   "outputs": [],
   "source": [
    "from langchain_core.runnables import RunnablePassthrough, RunnableLambda\n",
    "\n",
    "full_chain = {\n",
    "    \"category\": router_chain,\n",
    "    \"question\": RunnablePassthrough(),\n",
    "} | RunnableLambda(route)"
   ]
  },
  {
   "cell_type": "markdown",
   "source": [
    "## Test the chain\n"
   ],
   "metadata": {
    "collapsed": false
   }
  },
  {
   "cell_type": "code",
   "execution_count": 8,
   "metadata": {},
   "outputs": [
    {
     "name": "stdout",
     "output_type": "stream",
     "text": [
      "invoke sub_query_chain...\n",
      "sub_queries: ['What are the different types of memory?', 'What are the different types of ANN algorithms?']\n",
      "\n",
      "\n",
      " The different types of memory include:\n",
      "\n",
      "1. Sensory Memory: This is the earliest stage of memory, providing the ability to retain impressions of sensory information after the original stimuli have ended. It typically lasts for up to a few seconds. Subcategories include iconic memory (visual), echoic memory (auditory), and haptic memory (touch).\n",
      "\n",
      "2. Short-Term Memory (STM) or Working Memory: It stores information that we are currently aware of and needed to carry out complex cognitive tasks such as learning and reasoning. Short-term memory is believed to have the capacity of about 7 items and lasts for 20-30 seconds.\n",
      "\n",
      "3. Long-Term Memory (LTM): Long-term memory can store information for a remarkably long time, ranging from a few days to decades, with an essentially unlimited storage capacity. There are two subtypes of LTM:\n",
      "   - Explicit / declarative memory: This is memory of facts and events, and refers to those memories that can be consciously recalled.\n",
      "   - Implicit / procedural memory: This type of memory is unconscious and involves skills and routines that are performed automatically.\n",
      "\n",
      "The different types of Approximate Nearest Neighbors (ANN) algorithms include:\n",
      "\n",
      "1. LSH (Locality-Sensitive Hashing): It introduces a hashing function such that similar input items are mapped to the same buckets with high probability.\n",
      "\n",
      "2. ANNOY (Approximate Nearest Neighbors Oh Yeah): The core data structure are random projection trees, a set of binary trees where each non-leaf node represents a hyperplane splitting the input space into half and each leaf stores one data point.\n",
      "\n",
      "3. FAISS (Facebook AI Similarity Search): It operates on the assumption that in high dimensional space, distances between nodes follow a Gaussian distribution and thus there should exist clustering of data points.\n",
      "\n",
      "4. ScaNN (Scalable Nearest Neighbors): The main innovation in ScaNN is anisotropic vector quantization. It quantizes a data point to such that the inner product is as similar to the original distance as possible, instead of picking the closet quantization centroid points.\n"
     ]
    }
   ],
   "source": [
    "query1 = \"Which are the different types of memory and different types ANN algorithms?\"\n",
    "\n",
    "print(\"\\n\\n\", full_chain.invoke(query1))"
   ]
  },
  {
   "cell_type": "code",
   "execution_count": 9,
   "metadata": {},
   "outputs": [
    {
     "name": "stdout",
     "output_type": "stream",
     "text": [
      "The different types of memory include:\n",
      "\n",
      "1. Sensory Memory: This is the earliest stage of memory, retaining impressions of sensory information for up to a few seconds. It includes iconic (visual), echoic (auditory), and haptic (touch) memory.\n",
      "\n",
      "2. Short-Term Memory (STM) or Working Memory: This stores information that we are currently aware of and needed for complex cognitive tasks. It has a capacity of about 7 items and lasts for 20-30 seconds.\n",
      "\n",
      "3. Long-Term Memory (LTM): This can store information for a remarkably long time, ranging from a few days to decades, with an essentially unlimited storage capacity. It includes explicit/declarative memory (facts and events that can be consciously recalled) and implicit/procedural memory (skills and routines performed automatically).\n",
      "\n",
      "As for ANN algorithms for fast Maximum Inner Product Search (MIPS), the common choice is the approximate nearest neighbors (ANN) algorithm. This algorithm returns approximately the top k nearest neighbors, trading off a little accuracy for a significant speedup.\n"
     ]
    }
   ],
   "source": [
    "print(vanilla_rag_chain.invoke(query1))"
   ]
  },
  {
   "cell_type": "code",
   "execution_count": 10,
   "metadata": {
    "collapsed": false,
    "jupyter": {
     "outputs_hidden": false
    },
    "pycharm": {
     "name": "#%%\n"
    }
   },
   "outputs": [
    {
     "name": "stdout",
     "output_type": "stream",
     "text": [
      "invoke vanilla_rag_chain...\n",
      "\n",
      "\n",
      " The different types of memory include:\n",
      "\n",
      "1. Sensory Memory: This is the earliest stage of memory, retaining impressions of sensory information for up to a few seconds. It includes iconic (visual), echoic (auditory), and haptic (touch) memory.\n",
      "\n",
      "2. Short-Term Memory (STM) or Working Memory: This type of memory stores information that we are currently aware of and is needed for complex cognitive tasks. It has a capacity of about 7 items and lasts for 20-30 seconds.\n",
      "\n",
      "3. Long-Term Memory (LTM): This type of memory can store information for a remarkably long time, ranging from a few days to decades, with an essentially unlimited storage capacity. It has two subtypes:\n",
      "   - Explicit / declarative memory: This is memory of facts and events that can be consciously recalled. It includes episodic memory (events and experiences) and semantic memory (facts and concepts).\n",
      "   - Implicit / procedural memory: This type of memory is unconscious and involves skills and routines that are performed automatically.\n",
      "\n",
      "4. Memory Stream: This is a long-term memory module that records a comprehensive list of an agent’s experiences in natural language.\n",
      "\n",
      "These types of memory work together to help us acquire, store, retain, and retrieve information.\n"
     ]
    }
   ],
   "source": [
    "query2 = \"Which are the different types of memory?\"\n",
    "\n",
    "print(\"\\n\\n\", full_chain.invoke(query2))"
   ]
  },
  {
   "cell_type": "code",
   "execution_count": 11,
   "metadata": {},
   "outputs": [
    {
     "name": "stdout",
     "output_type": "stream",
     "text": [
      "The different types of memory include:\n",
      "\n",
      "1. Sensory Memory: This is the earliest stage of memory, retaining impressions of sensory information for up to a few seconds. It includes iconic memory (visual), echoic memory (auditory), and haptic memory (touch).\n",
      "\n",
      "2. Short-Term Memory (STM) or Working Memory: This type of memory stores information that we are currently aware of and is needed for complex cognitive tasks. It has a capacity of about 7 items and lasts for 20-30 seconds.\n",
      "\n",
      "3. Long-Term Memory (LTM): This type of memory can store information for a remarkably long time, ranging from a few days to decades, with an essentially unlimited storage capacity. It has two subtypes:\n",
      "   - Explicit / declarative memory: This is memory of facts and events that can be consciously recalled. It includes episodic memory (events and experiences) and semantic memory (facts and concepts).\n",
      "   - Implicit / procedural memory: This type of memory is unconscious and involves skills and routines that are performed automatically.\n",
      "\n",
      "4. Memory Stream: This is a long-term memory module that records a comprehensive list of agents’ experiences in natural language.\n",
      "\n",
      "Each of these types of memory plays a different role in how we process, store, and recall information.\n"
     ]
    }
   ],
   "source": [
    "print(vanilla_rag_chain.invoke(query2))"
   ]
  }
 ],
 "metadata": {
  "kernelspec": {
   "display_name": "Python 3 (ipykernel)",
   "language": "python",
   "name": "python3"
  },
  "language_info": {
   "codemirror_mode": {
    "name": "ipython",
    "version": 3
   },
   "file_extension": ".py",
   "mimetype": "text/x-python",
   "name": "python",
   "nbconvert_exporter": "python",
   "pygments_lexer": "ipython3",
   "version": "3.9.15"
  }
 },
 "nbformat": 4,
 "nbformat_minor": 4
}